Recent Advances in Information Systems and Technologies by Álvaro Rocha Ana Maria Correia Hojjat Adeli Luís Paulo Reis & Sandra Costanzo

Author: Álvaro Rocha, Ana Maria Correia, Hojjat Adeli, Luís Paulo Reis & Sandra Costanzo
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham


Keywords

Security, Insider attacks, Big data, Authentication, Hadoop

1 Introduction

The exponential growth of data in every aspect of our lives, and in enterprises across the world, creates a demand to draw value from that data. In 2013, five exabytes of data were created each day in various sizes and formats, from sensors, individual archives, social networks, the IoT (Internet of Things), and companies [1]. One of the most challenging issues is how to manage such a large amount of data effectively and how to identify new ways of analyzing it to extract value. Big Data technologies are a step forward in handling this problem. An early version of the big data concept was described in 2001 in the Gartner report by Laney [2], in which big data was defined as large and complex data sets that then-current computing facilities were not able to handle. It is characterized by the 3Vs (Volume, Velocity, and Variety). Some organizations have since added further Vs, such as “Veracity” and “Value” [3], to refine the characterization of big data. As these systems grow in popularity, their repositories are increasingly likely to hold sensitive data, which must be secured properly. There is no doubt that new frameworks for analyzing data can provide a robust foundation for a new generation of analytics and insight, but it is important to consider security before launching or expanding a big data platform. The complexity and variety of these systems call for a comprehensive approach to the security of the entire big data system [4]. Hadoop systems are insecure by default, and customers often deploy them quickly without proper controls, which can provoke serious errors and lead to an organizational disaster. Such systems are particularly exposed to insider attacks. The aim of this paper is to analyze whether big data system administrators are concerned with the security and privacy of their systems' users.
To this end, we present the results of a survey of big data administrators, with questions that allow us to draw conclusions about the state of security in these systems. Additionally, we provide a foundation for security on big data platforms, and in Apache Hadoop in particular, and show what an insider attacker can do when they have access to a network with a non-secure Hadoop cluster. The paper is organized as follows. Section 2 discusses related work on security in big data platforms. Section 3 describes the Apache Hadoop platform and its security model. Section 4 presents some attacks that an insider can perform in a non-secure Hadoop environment. Section 5 reports the results of a survey on which platforms big data administrators work with and whether security is configured appropriately. Section 6 presents the results of benchmark tests performed to evaluate the performance impact of enabling encryption; these results help us understand the cost of such security measures. Finally, Sect. 7 concludes the paper and proposes future work.





